feat(context): make payload-summary char cap configurable per profile by whoabuddy · Pull Request #4 · aibtcdev/agent-runtime

whoabuddy · 2026-05-08T21:30:18Z

Summary

Make the per-task payload-summary character cap configurable via profile.context_policy.max_payload_chars. Default behavior is unchanged — profiles that don't set the field continue to use the existing 4000-char cap (now exported as DEFAULT_MAX_PAYLOAD_CHARS).

Why

The hardcoded MAX_PAYLOAD_CHARS = 4000 inside summarizeJson (src/context.ts:10) truncated task payloads regardless of context_policy.max_prompt_chars. Profiles whose prompt budget has plenty of room (16K–32K) still lost most of their evidence payload because the payload section itself was capped at 4K before assembly.

Concrete case: a fleet-council review profile shipping a curated PR diff in payload.curated_diff_excerpt only saw the first ~3KB of the diff after JSON-stringification, despite max_prompt_chars: 18000 and an assembled prompt totaling ~7KB. The bottleneck was the payload cap, not the prompt cap.

What changes

src/context.ts: rename MAX_PAYLOAD_CHARS to DEFAULT_MAX_PAYLOAD_CHARS and export it; summarizeJson now accepts an optional maxChars parameter; buildPromptText passes profile.context_policy.max_payload_chars through.
src/types.ts: add optional max_payload_chars?: number to Profile.context_policy with a doc comment explaining when to set it.
src/runtime.test.ts: three new tests covering default behavior (unchanged), higher override (lets a previously-truncated payload through), and lower override (truncates aggressively when desired).

Backward compat

Fully backward compatible. The field is optional. Profiles without it use DEFAULT_MAX_PAYLOAD_CHARS = 4000 — identical to the current behavior. No migration required.

Test plan

bun test — 118 pass / 0 fail (3 new tests, no regressions)
bun test -t "payload" — 3 new tests pass

Caller migration (none required)

Profiles that need bigger payload budgets opt in:

{
  "context_policy": {
    "include_recent_task_memory": false,
    "max_prompt_chars": 32000,
    "max_payload_chars": 20000
  }
}

Profiles that don't set the field keep the current 4000-char cap.

🤖 Generated with Claude Code

The hardcoded 4000-char cap inside `summarizeJson` truncated task payloads regardless of the profile's `context_policy.max_prompt_chars`, which meant evidence-heavy review tasks (curated PR diffs, structured audits) lost most of their payload before reaching the model — even on profiles whose prompt budget had room to spare. Add an optional `max_payload_chars` field to `context_policy` and thread it through `buildPromptText` → `summarizeJson`. Default behavior is unchanged: profiles that don't set the field keep the 4000-char cap (now exported as `DEFAULT_MAX_PAYLOAD_CHARS`). Tests cover three cases: default (no override, 4000 cap applies), higher override (12000 lets a 8000-char payload through), and lower override (1000 cap truncates a 2000-char payload).

…it flag, host requirement, write-back guard, idempotency key Addresses 7 inline review findings (Copilot + secret-mars + arc0btc) on PR #5 — all substantive items resolved in-tree, no follow-ups left. **Catch widening (Copilot #4, secret-mars [blocking-risk], arc0btc #1)** `runSubstrateIntakeTick` previously only caught `claimNextJob`. Both `jobRowToTaskInput` and `enqueueTask` ran outside the try, so a malformed JobRow or a local SQLite lock contention escaped as a thrown error from `runOnce`. Each call site now has its own catch with a distinct skip reason — `job-parse-fail` and `local-enqueue-fail` — so the contract's "NEVER throws" guarantee is real. The substrate job stays under lease in both failure modes; `releaseExpiredLeases` reconciles on the next cycle. **`substrateDbInitialized` retry on credential fail (Copilot #1, all 3 reviewers seconded)** The flag was set to `true` BEFORE `createSubstrateConnection` succeeded. A transient credential read miss at first tick (e.g. encrypted blob not yet available) permanently disabled substrate intake for the process lifetime — the contract's documented `[substrate] skip reason=credential-fail` log line would never re-fire. Flag now sets only inside the success branch; next tick retries. **`runSubstrateWriteBack` swallows transient PG blips (arc0btc #3)** Was the only substrate fn whose Postgres calls were unguarded — a mid-write-back connection reset would propagate out of `runOnce` even though the local task already finished cleanly. Wrapped both `completeJob` and `failJob` in a single try; new `[substrate] write-back error=<msg> jobs.id=<id>` log line on transient fail. Lease recovery still reconciles eventually. **Default host removed (Copilot #3, arc0btc seconded)** `host` no longer defaults to a hard-coded private IP. When `substrate.enabled: true`, an explicit `substrate.host` is required — `createSubstrateConnection` throws a clear error if unset. Closes the "dev/test slot misconfigured at enabled:true accidentally connects to prod" footgun. **Self-contained log fallbacks (secret-mars #5)** The empty-else branches on `completeJob`/`failJob` `{ ok: false }` results relied on the substrate-db package emitting its own `[substrate] complete-epoch-mismatch ...` log. If that package's log format ever changes, those branches went silent. Added `[substrate] complete-failed jobs.id=<id> epoch=<n>` and `[substrate] fail-failed ...` fallbacks so this code is self-contained. **Idempotency-key threading (arc0btc #2 design, secret-mars idempotency follow-up)** Closes the "side-effecting jobs can execute twice when a lease expires mid-flight" hazard structurally — the fence on `jobs.claim_epoch` only protects the status write, not the action. `jobRowToTaskInput` now threads `payload.idempotency_key = "substrate-<job_id>-e<claim_epoch>"` so downstream side-effecting handlers (email send, PR open, tx broadcast) can dedup against their own per-handler key store. Bounds the *consequence* from "action runs twice" to "action runs once, second attempt no-ops." Substrate tasks also enqueue with `priority: 1` so they execute on the same or next tick — the lease window now tracks real execution latency instead of waiting behind lower-priority work, which shrinks the "lease expires while task waits in local queue" window further. **Silent null claim (Copilot #5)** Contract said null-claim is silent; impl logged `[substrate] claim ... result=none` every tick. Dropped the log — quiet-tick visibility lives in successful-claim and idle-dispatch event lines, not substrate tick-rate noise. **Other nits (Copilot #2, arc0btc #5)** - Removed unused `getTaskById` import in `substrate.ts`. - Documented `substrate` block's shallow-merge behavior in `types.ts` (unlike `profiles`/`adapters` which deep-merge). Slots that extend a base and want to flip only `isLeaseRecoveryOwner: true` must repeat the whole substrate block. Deferred (already declared out of scope by reviewers): - `pg_advisory_lock` guard on `isLeaseRecoveryOwner` race (secret-mars [suggestion]) — multi-slot mis-config protection is a follow-up. - `_substrate_*` payload-injection vs strict validators (secret-mars [suggestion]) — naming is defensive prefixing already. Contract block in src/substrate.ts updated with all new log lines and a new "Side-effect duplicate-execution guard" section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…l/lease-recovery paths Closes the test-gap finding from arc0btc #4 and secret-mars: prior suite covered only `resolveSubstrateConfig`, `jobRowToTaskInput`, and two write-back *early-return* guards via a fakeDb that threw if touched. Claim, complete, fail, epoch-mismatch no-op, transient write-back throw, and lease recovery — the "highest correctness surface" per the PR's own Commit-2 description — were untested. Rewires the file to use `mock.module("@genesis-works/substrate-db", ...)` with mock fns per substrate-db export. Local DB is a real `openDb(:memory:)` so `enqueueTask` exercises the actual sqlite write path. console.{info,error,warn} are spied per-test for log-line assertions. New tests (18, alongside 10 existing for 28 total): - `runSubstrateIntakeTick`: - happy claim → enqueues local task, returns `claimed=true` with correct epoch + jobId, emits `[substrate] claim ...` log - null claim → silent (no `[substrate] claim` log line; just `claimed: false`) - db-unreachable → catches `claimNextJob` throw, logs `reason=db-unreachable` - local-enqueue-fail → forces `enqueueTask` to throw by closing the local db, asserts catch fires + log line includes `reason=local-enqueue-fail jobs.id=<id>` (proves Copilot #4 / arc0btc #1 fix is wired) - `runSubstrateWriteBack`: - non-substrate source no-op (existing — kept) - missing `_substrate_job_id` no-op (existing — kept) - complete happy → calls `completeJob` with `claim_epoch` arg, `[substrate] complete jobs.id=... epoch=...` log - complete `{ ok: false }` → emits self-contained fallback `[substrate] complete-failed` log (proves secret-mars #5 fix) - complete throws → catches transient PG blip, emits `[substrate] write-back error=... jobs.id=...` (proves arc0btc #3 fix is wired — write-back no longer propagates out of `runOnce`) - fail happy → calls `failJob`, logs reason snippet - fail `{ ok: false }` → emits `[substrate] fail-failed` fallback - `runSubstrateLeaseRecovery`: - released > 0 → logs `[substrate] lease-recovery released=<n>` - released = 0 → no log (quiet success) - throws → catches, logs `[substrate] lease-recovery error=...`, does NOT propagate - `createSubstrateConnection`: - throws when `substrate.host` is unset (proves Copilot #3 fix — no implicit default to a hard-coded private IP) - throws on whitespace-only host - `jobRowToTaskInput`: - asserts `payload.idempotency_key = "substrate-<job_id>-e<epoch>"` (proves idempotency-key threading is wired) - asserts `priority = 1` - asserts `_substrate_*` payload fields threaded - asserts `max_attempts = 1` (substrate handles retry) - existing source / subject-fallback tests kept Header comment rewritten to match what's actually covered (Copilot #7). Unused `mock`/`beforeEach`/`afterEach` imports replaced with the ones actually used (Copilot #6 — though now they ARE used, so the prior nit naturally resolves). Test count: 143 total (28 substrate + 115 existing) — up from 125. `bunx tsc --noEmit` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(substrate): opt-in substrate intake — config, credential, claim, enqueue Adds @genesis-works/substrate-db as a dependency and wires a substrate dispatch intake path into the runOnce tick. Opt-in per slot: substrate is disabled by default (config.substrate.enabled=false). Flipping this flag live changes zero behavior on any slot that has not set it. Config keys (host config substrate block): substrate.enabled — bool, default false substrate.credential — string, credential id (NEVER plaintext password) substrate.kinds — string[], job kinds this slot handles substrate.slotId — string, this slot's claim identifier substrate.leaseSecs — number, default 300 (5 minutes) substrate.host/port/database/user — Postgres connection params On each tick (when enabled): 1. Credential resolved via existing encrypted-blob pattern. 2. claimNextJob(slotId, kinds) called against shared Postgres. Log: "[substrate] claim slot=<id> kinds=<list> result=<jobs.id|none>" 3. Claimed job converted to TaskInput (source="substrate:<jobId>"). 4. TaskInput enqueued as a local runtime task for next-tick execution. Failure conditions (all logged, none crash): - substrate unreachable → "[substrate] skip reason=db-unreachable error=..." - credential fail → "[substrate] skip reason=credential-fail error=..." - claimNextJob null → "[substrate] claim ... result=none" (silent no-op) Pinned substrate-db SHA: d458200b008efe8be3d37912a71b990d73ff2b17 (arc0btc/substrate-db — private, deployed alongside agent-runtime on slots) Co-Authored-By: Claude <noreply@anthropic.com> * feat(substrate): lease semantics — epoch-fenced write-back, max_attempts, lease recovery Adds the epoch-fenced write-back hook and lease recovery cadence to the substrate intake adapter. Write-back (after finalizeTaskAttempt): - If task.source starts with "substrate:", calls runSubstrateWriteBack. - Reads _substrate_job_id and _substrate_claim_epoch from task payload. - Calls completeJob(epoch) on success, failJob(epoch) on failure. - Epoch mismatch (stale executor woke after lease recovery) → logged no-op: "[substrate] complete-epoch-mismatch ... stale executor, ignored" - Pre-fencing row (epoch=0, expected>0) → distinct log line: "pre-fencing row, ignored" - completeJob/failJob return WriteBackResult — conflict is logged, not thrown. Lease recovery: - Runs releaseExpiredLeases on the substrate DB when isLeaseRecoveryOwner=true. - Cadence: leaseRecoveryCadenceSecs (default 60s) using a module-level timer. - ONE nominated owner runs this — not all slots — to avoid stampede. - Log: "[substrate] lease-recovery released=<n>" - Lease expiry measured against Postgres now() — NTP drift irrelevant. Log lines contract (stable, parseable): [substrate] claim slot=<id> kinds=<list> result=<jobs.id|none> [substrate] complete jobs.id=<id> epoch=<n> [substrate] fail jobs.id=<id> epoch=<n> reason=<...> [substrate] lease-recovery released=<n> [substrate] skip reason=credential-fail error=<...> [substrate] skip reason=db-unreachable error=<...> [substrate] complete-epoch-mismatch jobs.id=<id> expected=<n> actual=<m> — ... Tests: 125/125 pass (10 new substrate tests + 115 existing). Co-Authored-By: Claude <noreply@anthropic.com> * fix(substrate): review-feedback correctness pass — catch widening, init flag, host requirement, write-back guard, idempotency key Addresses 7 inline review findings (Copilot + secret-mars + arc0btc) on PR #5 — all substantive items resolved in-tree, no follow-ups left. **Catch widening (Copilot #4, secret-mars [blocking-risk], arc0btc #1)** `runSubstrateIntakeTick` previously only caught `claimNextJob`. Both `jobRowToTaskInput` and `enqueueTask` ran outside the try, so a malformed JobRow or a local SQLite lock contention escaped as a thrown error from `runOnce`. Each call site now has its own catch with a distinct skip reason — `job-parse-fail` and `local-enqueue-fail` — so the contract's "NEVER throws" guarantee is real. The substrate job stays under lease in both failure modes; `releaseExpiredLeases` reconciles on the next cycle. **`substrateDbInitialized` retry on credential fail (Copilot #1, all 3 reviewers seconded)** The flag was set to `true` BEFORE `createSubstrateConnection` succeeded. A transient credential read miss at first tick (e.g. encrypted blob not yet available) permanently disabled substrate intake for the process lifetime — the contract's documented `[substrate] skip reason=credential-fail` log line would never re-fire. Flag now sets only inside the success branch; next tick retries. **`runSubstrateWriteBack` swallows transient PG blips (arc0btc #3)** Was the only substrate fn whose Postgres calls were unguarded — a mid-write-back connection reset would propagate out of `runOnce` even though the local task already finished cleanly. Wrapped both `completeJob` and `failJob` in a single try; new `[substrate] write-back error=<msg> jobs.id=<id>` log line on transient fail. Lease recovery still reconciles eventually. **Default host removed (Copilot #3, arc0btc seconded)** `host` no longer defaults to a hard-coded private IP. When `substrate.enabled: true`, an explicit `substrate.host` is required — `createSubstrateConnection` throws a clear error if unset. Closes the "dev/test slot misconfigured at enabled:true accidentally connects to prod" footgun. **Self-contained log fallbacks (secret-mars #5)** The empty-else branches on `completeJob`/`failJob` `{ ok: false }` results relied on the substrate-db package emitting its own `[substrate] complete-epoch-mismatch ...` log. If that package's log format ever changes, those branches went silent. Added `[substrate] complete-failed jobs.id=<id> epoch=<n>` and `[substrate] fail-failed ...` fallbacks so this code is self-contained. **Idempotency-key threading (arc0btc #2 design, secret-mars idempotency follow-up)** Closes the "side-effecting jobs can execute twice when a lease expires mid-flight" hazard structurally — the fence on `jobs.claim_epoch` only protects the status write, not the action. `jobRowToTaskInput` now threads `payload.idempotency_key = "substrate-<job_id>-e<claim_epoch>"` so downstream side-effecting handlers (email send, PR open, tx broadcast) can dedup against their own per-handler key store. Bounds the *consequence* from "action runs twice" to "action runs once, second attempt no-ops." Substrate tasks also enqueue with `priority: 1` so they execute on the same or next tick — the lease window now tracks real execution latency instead of waiting behind lower-priority work, which shrinks the "lease expires while task waits in local queue" window further. **Silent null claim (Copilot #5)** Contract said null-claim is silent; impl logged `[substrate] claim ... result=none` every tick. Dropped the log — quiet-tick visibility lives in successful-claim and idle-dispatch event lines, not substrate tick-rate noise. **Other nits (Copilot #2, arc0btc #5)** - Removed unused `getTaskById` import in `substrate.ts`. - Documented `substrate` block's shallow-merge behavior in `types.ts` (unlike `profiles`/`adapters` which deep-merge). Slots that extend a base and want to flip only `isLeaseRecoveryOwner: true` must repeat the whole substrate block. Deferred (already declared out of scope by reviewers): - `pg_advisory_lock` guard on `isLeaseRecoveryOwner` race (secret-mars [suggestion]) — multi-slot mis-config protection is a follow-up. - `_substrate_*` payload-injection vs strict validators (secret-mars [suggestion]) — naming is defensive prefixing already. Contract block in src/substrate.ts updated with all new log lines and a new "Side-effect duplicate-execution guard" section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(substrate): expand coverage via mock.module — claim/complete/fail/lease-recovery paths Closes the test-gap finding from arc0btc #4 and secret-mars: prior suite covered only `resolveSubstrateConfig`, `jobRowToTaskInput`, and two write-back *early-return* guards via a fakeDb that threw if touched. Claim, complete, fail, epoch-mismatch no-op, transient write-back throw, and lease recovery — the "highest correctness surface" per the PR's own Commit-2 description — were untested. Rewires the file to use `mock.module("@genesis-works/substrate-db", ...)` with mock fns per substrate-db export. Local DB is a real `openDb(:memory:)` so `enqueueTask` exercises the actual sqlite write path. console.{info,error,warn} are spied per-test for log-line assertions. New tests (18, alongside 10 existing for 28 total): - `runSubstrateIntakeTick`: - happy claim → enqueues local task, returns `claimed=true` with correct epoch + jobId, emits `[substrate] claim ...` log - null claim → silent (no `[substrate] claim` log line; just `claimed: false`) - db-unreachable → catches `claimNextJob` throw, logs `reason=db-unreachable` - local-enqueue-fail → forces `enqueueTask` to throw by closing the local db, asserts catch fires + log line includes `reason=local-enqueue-fail jobs.id=<id>` (proves Copilot #4 / arc0btc #1 fix is wired) - `runSubstrateWriteBack`: - non-substrate source no-op (existing — kept) - missing `_substrate_job_id` no-op (existing — kept) - complete happy → calls `completeJob` with `claim_epoch` arg, `[substrate] complete jobs.id=... epoch=...` log - complete `{ ok: false }` → emits self-contained fallback `[substrate] complete-failed` log (proves secret-mars #5 fix) - complete throws → catches transient PG blip, emits `[substrate] write-back error=... jobs.id=...` (proves arc0btc #3 fix is wired — write-back no longer propagates out of `runOnce`) - fail happy → calls `failJob`, logs reason snippet - fail `{ ok: false }` → emits `[substrate] fail-failed` fallback - `runSubstrateLeaseRecovery`: - released > 0 → logs `[substrate] lease-recovery released=<n>` - released = 0 → no log (quiet success) - throws → catches, logs `[substrate] lease-recovery error=...`, does NOT propagate - `createSubstrateConnection`: - throws when `substrate.host` is unset (proves Copilot #3 fix — no implicit default to a hard-coded private IP) - throws on whitespace-only host - `jobRowToTaskInput`: - asserts `payload.idempotency_key = "substrate-<job_id>-e<epoch>"` (proves idempotency-key threading is wired) - asserts `priority = 1` - asserts `_substrate_*` payload fields threaded - asserts `max_attempts = 1` (substrate handles retry) - existing source / subject-fallback tests kept Header comment rewritten to match what's actually covered (Copilot #7). Unused `mock`/`beforeEach`/`afterEach` imports replaced with the ones actually used (Copilot #6 — though now they ARE used, so the prior nit naturally resolves). Test count: 143 total (28 substrate + 115 existing) — up from 125. `bunx tsc --noEmit` clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Jason Schrader <jason@joinfreehold.com> Co-authored-by: Claude <noreply@anthropic.com>

secret-mars mentioned this pull request May 30, 2026

feat(substrate): opt-in substrate dispatch intake (Phase 5) #5

Merged

8 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(context): make payload-summary char cap configurable per profile#4

feat(context): make payload-summary char cap configurable per profile#4
whoabuddy wants to merge 1 commit into
mainfrom
feat/configurable-max-payload-chars

whoabuddy commented May 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

whoabuddy commented May 8, 2026

Summary

Why

What changes

Backward compat

Test plan

Caller migration (none required)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant